Characterizing Visitors to a Website Across Multiple Sessions
Abstract
Characterizing web users based on their interactions with a particular website is a key problem in web mining. These interactions are reflected in the clickstream data recorded in weblogs, the content of the pages users view, and actions such as the search queries they type. Popular websites generally see a wide variety of behaviors, so to make the problem more tractable it is often wise to first group similar users into relatively homogeneous segments and then characterize each segment. Much work has been done on clustering visitors to a site, but the analyses typically use substantially simplified descriptions of session behavior and do not characterize users by their footprints across multiple visits to the site. In this paper, each web user is represented by a set of sessions, and a two-stage methodology for grouping such users is proposed. In the first stage, sessions are clustered into conceptual session types based on both the trajectory taken through the website and the time spent at each page. The distribution of a user's multiple visits across these session types forms the basis of user grouping in the second stage. Clustering on such a rich user description provides a platform for further characterization. Results are presented using weblogs of a popular website to illustrate the techniques.
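The two-stage grouping described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's method: the abstract does not name a clustering algorithm, so plain k-means is swapped in as an assumption, sessions are described only by the share of time spent on each page (the order of the trajectory is ignored), and the page names, user IDs, and dwell times are invented toy data.

```python
from collections import Counter

# Hypothetical page vocabulary; a real site would derive this from its weblogs.
PAGES = ["home", "search", "product", "checkout"]


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = [points[0]]
    while len(centroids) < k:
        # Seed the next centroid with the point farthest from the current ones.
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def session_vector(session):
    """Share of session time spent on each page (assumed session description)."""
    total = sum(seconds for _, seconds in session) or 1.0
    time_by_page = Counter()
    for page, seconds in session:
        time_by_page[page] += seconds
    return [time_by_page[p] / total for p in PAGES]


# Toy data: each user has several sessions of (page, seconds) visits.
users = {
    "u1": [[("home", 5), ("search", 40), ("search", 30)],
           [("search", 60), ("home", 5)]],
    "u2": [[("home", 4), ("search", 50)],
           [("search", 45), ("search", 20)]],
    "u3": [[("home", 5), ("product", 30), ("checkout", 60)],
           [("product", 20), ("checkout", 50)]],
    "u4": [[("product", 25), ("checkout", 70)],
           [("home", 3), ("product", 30), ("checkout", 40)]],
}

# Stage 1: cluster all sessions, regardless of user, into session types.
N_SESSION_TYPES = 2
owners, vectors = [], []
for user, sessions in users.items():
    for session in sessions:
        owners.append(user)
        vectors.append(session_vector(session))
session_types = kmeans(vectors, N_SESSION_TYPES)

# Stage 2: describe each user by the distribution of their sessions
# across session types, then cluster those distributions.
profiles = {}
for user in users:
    counts = Counter(t for o, t in zip(owners, session_types) if o == user)
    n = sum(counts.values())
    profiles[user] = [counts[t] / n for t in range(N_SESSION_TYPES)]

user_labels = kmeans(list(profiles.values()), 2)
user_groups = dict(zip(profiles, user_labels))
print(user_groups)
```

In this toy data, the search-heavy visitors and the purchase-heavy visitors end up in separate groups. A fuller session description would also need to encode the path taken through the site, for example with page-to-page transition counts.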
Similar resources
Monitoring Web Site Usage of e-Bug: A Hygiene and Antibiotic Awareness Resource for Children
BACKGROUND: e-Bug is an educational resource which teaches children and young people about microbes, hygiene, infection, and prudent antibiotic use. The e-Bug resources are available in over 22 different languages and they are used widely across the globe. The resources can be accessed from the e-Bug website. OBJECTIVE: The objective of this study was to analyze the usage of the e-Bug website i...
Using Google Analytics for measuring inlinks effectiveness
The aim of this brief communication is to develop a tracking methodology to analyse the effectiveness of inlink visits (return visit behaviour and length of sessions). In other words, how deep do inlink visitors navigate into the website? Do all inlinks perform the same? This paper addresses these questions by time series analysis of Google Analytics data, with a methodology developed by Plaza ...
Understanding Successive Searches Across Multiple Sessions Over the Web
This study intends to enhance the understanding of successive searches over multiple sessions by characterizing successive searches with a conceptual model, Multiple Information Seeking Episodes (MISE), validating MISE and supporting successive searches with a prototyped information system, PERsonalized and Successive Information Seeking Toolkit (PERSIST), whose requirements are derived from MI...
Probabilistic Deduplication of Anonymous Web Traffic
Cookies and log in-based authentication often provide incomplete data for stitching website visitors across multiple sources, necessitating probabilistic deduplication. We address this challenge by formulating the problem as a binary classification task for pairs of anonymous visitors. We compute visitor proximity vectors by converting categorical variables like IP addresses, product search key...
SkyServer Traffic Report - The First Five Years
The SkyServer is an Internet portal to the Sloan Digital Sky Survey Catalog Archive Server. From 2001 to 2006, there were a million visitors in 3 million sessions generating 170 million Web hits, 16 million ad-hoc SQL queries, and 65 million page views. The site currently averages 35 thousand visitors and 400 thousand sessions per month. The Web and SQL logs are public. We analyzed traffic and ...
Publication date: 2002